Johnson-Lindenstrauss Dimensionality Reduction on the Simplex

Authors

  • Rasmus J. Kyng
  • Jeff M. Phillips
  • Suresh Venkatasubramanian
Abstract

We propose an algorithm for dimensionality reduction on the simplex, mapping a set of high-dimensional distributions to a space of lower-dimensional distributions while approximately preserving the pairwise Hellinger distance between distributions. By restricting the input data to distributions that are in some sense quite smooth, we can map n points on the d-simplex to the simplex of O(ε^{-2} log n) dimensions with ε-distortion with high probability. The techniques used rely on a classical result by Johnson and Lindenstrauss on dimensionality reduction for Euclidean point sets and require the same number of random bits as non-sparse methods proposed by Achlioptas for database-friendly dimension-
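The link between Hellinger distance and Euclidean point sets can be illustrated with a minimal sketch. It relies on the standard identity that the Hellinger distance between p and q equals the Euclidean distance between their element-wise square roots divided by √2, so a Johnson-Lindenstrauss map applied to the square-root vectors approximately preserves Hellinger distance. This is not the authors' algorithm: the images here live in R^k rather than on a lower-dimensional simplex, and the step that maps back to the simplex is not reproduced. All function names, the dense Gaussian matrix, the sizes d = 400 and k = 128, and the use of weights bounded away from zero as a stand-in for "smooth" inputs are assumptions made for illustration.

```python
import math
import random

def hellinger(p, q):
    """Hellinger distance between two discrete distributions p and q."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))

def random_distribution(d, rng):
    """Point on the simplex from normalised positive weights.
    Weights in [0.5, 1.5) are an assumed, crude stand-in for 'smooth' inputs."""
    w = [rng.random() + 0.5 for _ in range(d)]
    s = sum(w)
    return [x / s for x in w]

def gaussian_projection(d, k, rng):
    """Dense k x d matrix with i.i.d. N(0, 1/k) entries (a standard JL map)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]

def project_sqrt(p, A):
    """Take the element-wise square root of p, then apply the linear map A."""
    r = [math.sqrt(x) for x in p]
    return [sum(a * x for a, x in zip(row, r)) for row in A]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def distortion_ratios(points, A):
    """(projected distance / sqrt(2)) / (Hellinger distance) for every pair."""
    images = [project_sqrt(p, A) for p in points]
    ratios = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = hellinger(points[i], points[j])
            e = euclidean(images[i], images[j]) / math.sqrt(2.0)
            ratios.append(e / h)
    return ratios

rng = random.Random(0)                     # fixed seed for repeatability
points = [random_distribution(400, rng) for _ in range(8)]
A = gaussian_projection(400, 128, rng)     # reduce from d = 400 to k = 128
ratios = distortion_ratios(points, A)
print(min(ratios), max(ratios))            # ratios near 1 mean low distortion
```

Ratios close to 1 indicate that the projected distances track the original Hellinger distances; increasing k should tighten the spread, in line with the O(ε^{-2} log n) target dimension stated above.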

Similar resources

Dimensionality Reduction on the Simplex

For many problems in data analysis, the natural way to model objects is as a probability distribution over a finite and discrete domain. Probability distributions over such domains can be represented as points on a (high-dimensional) simplex, and thus many inference questions involving distributions can be viewed geometrically as manipulating points on a simplex. The dimensionality of these poi...

The Fast Johnson-Lindenstrauss Transform

While we omit the proof, we remark that it is constructive. Specifically, A is a linear map consisting of random projections onto subspaces of R^d. These projections can be computed by n matrix multiplications, which take time O(nkd). This is fast enough to make the Johnson-Lindenstrauss transform (JLT) a practical and widespread algorithm for dimensionality reduction, which in turn motivates th...

Geometric Optimization, April 12, 2007, Lecture 25: Johnson-Lindenstrauss Lemma

The topic of this lecture is dimensionality reduction. Many problems have been solved efficiently in low dimensions, but very often the solutions for low-dimensional spaces are impractical for high-dimensional spaces because either space or running time is exponential in the dimension. In order to address the curse of dimensionality, one technique is to map a set of points in a high-dimensional space...

The Johnson-Lindenstrauss Lemma Is Optimal for Linear Dimensionality Reduction

For any n > 1 and 0 < ε < 1/2, we show the existence of an n-point subset X of R such that any linear map from (X, ℓ_2) to ℓ_2^m with distortion at most 1 + ε must have m = Ω(min{n, ε^{-2} log n}). Our lower bound matches the upper bounds provided by the identity matrix and the Johnson-Lindenstrauss lemma [JL84], improving the previous lower bound of Alon [Alo03] by a log(1/ε) factor.

Energy-aware adaptive Johnson-Lindenstrauss embedding via RIP-based designs

We consider a dimensionality-reducing matrix design based on training data, with constraints on its Frobenius norm and number of rows. Our design criterion aims to preserve the distances between the data points in the dimensionality-reduced space as closely as possible relative to their distances in the original data space. This approach can be considered as a deterministic Johnson-Lindenstrauss e...

Publication date: 2010